A first-ever design for integrated solar cells in commercial CMOS is presented. Two prototype designs have been developed, fabricated, and
tested.  The  average  efficiency  of  the  first  prototype  is  2.4%,  compared  to  an  estimated,  but  unverified  1%  from  previous  work.  The  actual 

1 Introduction


This report satisfies the final requirement of EOARD Grant FA8655-06-1-3053. The work
presented in the report was carried out from 15th September 2006 to 29th February 2008 (17.5
months). A 5.5-month extension (15th September to 29th February) was granted at no cost due to
chip manufacturing delays.

1.1  Scope  and  Objectives 

A new dimension of system architecture design is emerging where hundreds to thousands of
ultra-light (<10 g) sensor nodes will collectively perform a spectrum of remote sensing missions in a
distributed fashion. To support this architecture, high-volume production of sensor nodes at low
cost is required.

This basic research project is aimed at developing a technique to design and fabricate
self-powered wireless sensor nodes monolithically with commercially available complementary
metal-oxide-semiconductor (CMOS) technology. The goal is to realize a novel system-on-a-chip (SoC)
component integration on a single silicon die. Until now, integration of optical, radio frequency
(RF), solar power, and data handling technologies has necessitated the use of other system-level
integration approaches such as system-in-package (SiP), multi-chip module (MCM), and wafer-scale
integration (WSI). These approaches have been used in the DARPA-sponsored “Smart Dust”
effort. A feasibility study, using the space application as an example, had already been completed,
showing great promise for the project [6]. The feasibility study highlighted that optical sensors,
solar power, wireless communication, and data processing can conceivably be integrated on one
CMOS die. This preliminary work feeds directly into the work described here.

Defense interests parallel those of academia, and this technology could potentially be used to support a
variety of military missions. New terrestrial, atmospheric, and space-based missions have been
envisioned for distributed remote sensing networks. Potential missions include signals
intelligence, environment monitoring, close inspection, and numerous other envisioned and
yet-to-be-envisioned applications.

Finally, the novelty of this work clearly stands out in the literature: no one has previously
integrated a sensor technology with wireless communication, data processing, and
solar/self-powered technology on one CMOS die. This technology will meet the demand created
by the recent explosion of distributed mission proposals; new aerospace
applications alone have increased by 850% over the last decade. However, no system architecture
exists yet to support them.

Objectives: 

1. Develop key technologies suitable for mass-production of integrated heterogeneous
self-powered SoC sensor nodes to enable distributed missions for non-benign environments.

2.  Experimentally  verify  selected  key  emerging  technologies  on  commercial  bulk-CMOS  to 
demonstrate  the  approach  via  manufacture  and  testing  of  VLSI  circuits. 

1.2  Project  Milestones 

The  project  milestones  are  listed  below: 

15 September 2006  Contract Awarded

3-10 March 2007  IEEE Aerospace Conference (1st published & presented paper)

5 March 2007  Test VLSI Chip #1 submission for manufacture

21 March 2007  Submission to EOARD of the Window on Science Trip Report

21 March 2007  Submission to EOARD of the Interim report

13-17 August 2007  USU/AIAA Small Satellite Conference (2nd published & presented paper)

11 September 2007  Test VLSI Chip #2 and #3 submission for manufacture

1 November 2007  AIAA Journal of Spacecraft and Rockets (3rd published paper)

19 November 2007  Test VLSI Chip #4 submission for manufacture

8 February 2008  IEEE Int. Symposium on Circuits and Systems (4th paper accepted for publication, presentation is in May 2008)

3 March 2008  Contract effort completion and submission of the final report to EOARD

3 March 2008  Test VLSI Chip #5 submission for manufacture

1.3 Schedule of Test VLSI Chip Fabrication

Originally, two test VLSI chips were proposed for fabrication. Ultimately, five chips were
fabricated, as shown in Table 1. Results from these test chips are reported in Chapters 3 and 4.


Table 1. Test Chip Fabrication

Test Chip  Date       Size (mm^2)  Purpose
1          5 Mar 07   2            Solar cells (1st design)
2          11 Sep 07  3.5          Synchronous/commercial cells + test structures
3          11 Sep 07  3.5          Synchronous/hardened cells
4          19 Nov 07  7            Asynchronous/hardened cells + solar cells (2nd design)
5          3 Mar 08   2            Solar cells (final design)

1.4  Outline  of  Report  Contents 

Chapter  Two,  Mission  Needs  Statement,  first  presents  the  compelling  need  for  this  research 
followed  by  a  brief  literature  review.  The  chapter  concludes  with  a  technology  development 
roadmap  for  this  and  future  work. 

Chapter  Three,  SiGe  BiCMOS  Solar  Cells,  presents  the  first  of  two  technology  focus  areas  for  this 
research.  The  novel  invention  of  monolithically  integrated  solar  cells  on  CMOS  is  presented, 
complete  with  the  basic  theory,  design,  and  test  results. 

Chapter  Four,  Synergy  of  Radiation  Hardening  by  Design  of  Asynchronous  Logic,  presents  the 
second  focus  of  this  research.  A  case  study  of  harmoniously  combining  two  existing  technologies 
is  presented,  which  results  in  CMOS  designs  that  are  tolerant  to  radiation  and  temperature 
extremes,  in  addition  to  voltage  and  process  parameter  fluctuations.  The  case  study  gives  the 
background  of  these  two  technologies,  describes  the  core  design  for  comparison,  then  presents  the 
simulation  and  hardware  test  results. 

Chapter  Five,  Conclusions,  summarizes  the  results  of  the  research  and  proposes  directions  for 
future  work. 

2 Mission Needs Statement

2.1 The Satellite-on-a-Chip Concept

The satellite-on-a-chip idea has sparked considerable interest in the space community since the first
known mention of the concept in the 1993-1994 timeframe [1]-[5]. An initial satellite-on-a-chip
feasibility study was completed, based on a monolithic system-on-a-chip (SoC) design, but the
lack of viable applications initially discouraged further development [6]. In response, the future
need for low-cost mass-producible very small satellites for distributed space missions was
examined [7].

The smallest silicon-based mass-producible technique for satellite fabrication was proposed by
Janson and Helvajian of the Aerospace Corporation, starting in 1993 [1]-[4]. Well beyond the
scope of SoC, the vision was to build satellites out of stacks of silicon wafers processed by
complementary metal-oxide-semiconductor (CMOS), microelectromechanical system (MEMS), and
photovoltaic foundries. Their team has since pioneered a range of small satellite manufacturing
technologies [8]. The high cost of commercializing these processes has prevented widespread
implementation.

The  Surrey  Space  Centre  set  a  long-term  goal  in  1999  of  developing  and  flying  the  world’s  first 
satellite-on-a-chip,  based  on  a  true  stand-alone  SoC  approach.  Since  that  time,  they  have 
facilitated  numerous  research  efforts  towards  that  goal.  The  monolithic  SoC  approach  has  been 
challenged  by  various  packaging  alternatives,  including  traditional  printed  circuit  board  (PCB), 
multichip  module  (MCM),  system-in-package  (SiP),  and  now  system-on-package  (SOP); 
however,  SoC’s  attraction  is  its  low  cost  and  mass-producibility. 

Related prototyping design activities have been undertaken, targeting a system mass less than one
kilogram, leading to a 70 g satellite-on-a-PCB prototype. This very small satellite design, named
PCBSat, has given insight into various aspects of satellite system development on a very small
scale. Although developed as a prototype, it gives rise to a promising cost-effective
mass-producible solution for certain large-scale distributed space missions [9].

Satellite-on-a-chip  has  gained  new  appeal  in  the  context  of  space  sensor  networks  [10].  Nearly  all 
wireless  sensor  network  applications  to  date  have  been  for  relatively  benign  terrestrial 
environments,  with  a  few  exceptions  where  thermal  extremes  are  concerned. 

2.2 Wireless Sensor Networks

The wireless sensor network concept emerged in the early 1990s, with academic roots that can be
traced through an original group of researchers at the University of California, Los Angeles [11].
Various  terms  have  been  used  to  describe  this  concept  over  the  past  decade,  but  “wireless  sensor 
networks”  has  endured.  In  addition  to  developing  the  theory  and  supporting  software,  three 
hardware  solutions  for  sensor  nodes,  sometimes  called  motes,  were  initially  pursued:  Smart  Dust, 
commercial  off-the-shelf  (COTS)  Dust,  and  Wireless  Integrated  Sensor  Networks  (WINS). 

Although the actual idea of Smart Dust was born at a 1992 U.S. military workshop, Pister [12] is
credited with coining the phrase and the first major development, shortly after leaving UCLA for
Berkeley. The first Smart Dust implementation was a battery-powered MCM featuring a MEMS
corner cube reflector for optical communications [13]. Pister’s team went on to demonstrate a
solar-powered variant soon after, as shown in Figure 1 [14].


Figure  1.  Smart  Dust 


The new Berkeley team developed COTS Dust in parallel with Smart Dust. As shown in Figure 2,
this concept was based on a PCB substrate, with three versions utilizing radio frequency (RF)
communications while one used optical [15]. Spin-off companies emerged, such as Crossbow,
which now sells the popular MICA family of motes. To simplify their implementation, the TinyOS
operating system is now used in most motes.

While Smart Dust was in development, four of the original UCLA academics, led by Kaiser [16],
pursued an RF-based SoC called WINS. Upon closer inspection, their approach was actually
based on MCM integration of a sensor, microprocessor, and transceiver, which is similar to
optical Smart Dust but uses an RF link. Kaiser went on to lead further integrated RF work in
CMOS; however, no recent work on WINS has been published in the literature.


Figure  2.  COTS  Dust 


One  of  the  most  promising  SoC  projects  is  WiseNET,  which  has  successfully  integrated  a  radio, 
microprocessor,  data  storage,  power  control,  and  analog  interface,  as  shown  in  Figure  3  [17]. 
Although  closer  to  a  true  SoC  solution,  the  WiseNET  sensor  node  still  requires  numerous  external 
components,  including  a  power  source,  passive  devices,  an  antenna,  and  sensor. 


Figure 3. WiseNET SoC Sensor Node

In response to WiseNET, the Smart Dust team recently published a comprehensive investigation
of an RF-based SoC approach [18]. It includes a discussion of the remaining work needed to realize a
complete stand-alone SoC implementation. They concluded that although recent SoC solutions
have demonstrated increased monolithic integration, many large off-chip components are still
required, such as a sensor, battery, passives, crystal clock source, and RF antenna. Completed
during the same period, our initial satellite-on-a-chip feasibility assessment, with similar
objectives, arrived at the same conclusions [6].


2.3  Survivable  SoC  Node  Requirements 

In  this  section,  we  discuss  functional  requirements  for  a  self-powered  SoC  sensor  node  design 
aimed  at  hostile  environments.  A  range  of  potential  solutions  for  a  generalized  set  of  functional 
requirements  is  presented,  focusing  on  the  following  aspects: 

•  Missions  and  sensors 

•  System  configuration 

•  Power  generation,  storage,  distribution,  and  control 

•  Data  handling,  processing,  and  storage 

•  Wireless  communications  with  other  nodes 

•  Environmental  operability  and  survivability 

2.3.1  Missions  and  Sensors 

The chosen SoC approach greatly limits payload options. Considering a case study mission
presented in [19], no on-chip sensor can detect plasma, due to the physical geometries
required. However, the following sensors are routinely manufactured in CMOS [20]:

Table 2. Typical CMOS Sensors

•  Visible  •  Infrared  •  Ultraviolet  •  Magnetic 

•  Radiation  •  Temperature 

CMOS  imagers  are  growing  in  popularity  and  may  eventually  replace  charge-coupled  devices 
(CCD)  for  most  imaging  applications  [21].  Unlike  CCDs,  CMOS  imagers  use  mainstream 
semiconductor  fabrication  techniques,  require  less  power,  and  can  be  integrated  monolithically 
with image co-processors. Complete camera-on-a-chip devices are now emerging [21]. Typically,
a  separate  lens  is  required  to  focus  the  image  on  the  sensor,  but  microlenses  can  now  be  integrated 
monolithically  [22]. 

Recently, a wide range of sensors has emerged, based on “CMOS-MEMS” technology.
CMOS-MEMS requires custom pre-, front-end, and/or back-end processing of the CMOS wafer. Of these
three methods, back-end bulk micromachining of CMOS has been the most successful. Due to its
growing popularity, a few commercial foundries, such as X-FAB, now offer limited CMOS-MEMS
processing. Table 3 lists some sensors that have been demonstrated [23].


Table  3.  Typical  CMOS  MEMS  Sensors 


•  Pressure 

•  Chemical 

•  Thermal 

•  Tactile 

•  Proximity 

•  Flow 

•  Force 

•  Neural 

•  Vacuum 

•  Acceleration 

•  Gyroscopic 

•  Audio 

2.3.2  System  Configuration 

With a goal of a SoC implementation, the configuration is essentially fixed to the planar nature of
a silicon chip. CMOS technology is the most widely used microelectronics fabrication
technology, due to its low cost at high volume. A maximum-sized prototype integrated circuit
(IC) design, using a multi-project vendor such as MOSIS or EUROPRACTICE, starts at $2,400
per die depending on the technology, while a production run would cost about $300 each.
Currently, feature sizes of 45 nm are possible, which will only shrink in time [24]. CMOS
technology options have broadened over the past decade with the introduction of processes
optimized for radio frequency (RF), optical sensors, integrated bipolar transistors (SiGe
BiCMOS), and non-volatile “flash” memory.

The primary advantage of a monolithic approach is the manufacturing simplicity. However, it
does not allow the attachment of discrete components or the merging of various elements into a
hybrid assembly, which imposes considerable limitations. Most notably, the design cannot exceed
the reticle size, which is a physical area limit imposed by the photolithography process used in the
particular semiconductor process line. This caps the maximum circuit area at approximately 400
mm^2 (20 x 20 mm) for modern CMOS processes [24]. Assuming a silicon density of 2330 kg/m^3
and wafer thickness of 0.75 mm, the die mass is approximately 0.7 g, i.e., on the order of one gram.
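
The die-mass estimate above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures quoted in the text (a 20 x 20 mm reticle, 0.75 mm wafer thickness, and a silicon density of 2330 kg/m^3); the function name is an illustrative choice and does not come from the original work.

```python
def die_mass_grams(side_mm: float, thickness_mm: float, density_kg_m3: float) -> float:
    """Mass in grams of a square die with the given side, thickness, and density."""
    volume_mm3 = side_mm * side_mm * thickness_mm   # die volume in cubic millimetres
    volume_m3 = volume_mm3 * 1e-9                   # 1 mm^3 = 1e-9 m^3
    return volume_m3 * density_kg_m3 * 1000.0       # kg -> g


# Reticle-limited die, using the values quoted in the text.
mass_g = die_mass_grams(side_mm=20.0, thickness_mm=0.75, density_kg_m3=2330.0)
print(f"Reticle-sized die mass: {mass_g:.2f} g")
```

Under these assumptions the result is about 0.7 g, which supports the order-of-magnitude figure of one gram per die. A thinned wafer would reduce the mass proportionally.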

In  1967,  a  technique  called  wafer-scale  integration  (WSI)  was  proposed  to  overcome  the  reticle 
limit  [25].  WSI  allows  multiple  reticle-sized  designs  to  be  co-located  on  the  same  wafer,  and  then 
connected  together  using  various  interconnection  techniques.  This  would  allow  a  final  product 
that  in  theory  could  be  as  large  as  the  entire  wafer,  which  could  be  as  large  as  300  mm  in  diameter 
[24].  Unfortunately,  inherent  defects  in  the  semiconductor  manufacturing  process  have  prevented 
WSI  from  becoming  widely  adopted  [26]. 

Multichip-Module  (MCM)  technology  eventually  replaced  WSI  for  designs  requiring  more  area 
[26].  MCMs  integrate  unpackaged  “known-good-die”  on  a  range  of  substrates,  such  as  printed 
circuit boards (PCBs), thin films, and ceramics using fine line interconnects. MCM technology,
including three-dimensional variants, has already been used in satellite applications [27]. MCMs
or other system-in-package (SiP) techniques are typically used in applications where integration
density or performance is essential [28]. For less demanding applications, evolutionary
advancements in IC packaging make traditional PCBs a cost-effective choice.

Despite  a  growing  number  of  packaging  alternatives,  SoC  technology  is  rapidly  advancing. 
Popular  MCM-based  miniaturization  efforts,  such  as  “Smart  Dust,”  are  now  looking  to  SoC  for 
further  miniaturization  of  their  terrestrial  wireless  sensors  [18]. 

2.3.3  Power  Generation,  Storage,  Distribution,  and  Control 

Power distribution, regulation, and control aspects of an electrical power system (EPS) can be met
with basic wiring, switching, and regulation circuitry that are routinely implemented in CMOS [29].
Recent “micro power” research has produced several new integrated options for SoC applications,
listed in Table 4 [30].

Table  4.  Micro  Power  Sources 

•  Solar  cells  •  Fuel  Cell  •  Vibration  •  Induction 

•  Chemical  battery  •  Nuclear  battery  •  Microturbine 

Power generation via integrated solar cells on CMOS is the most straightforward solution, but has
not yet been demonstrated successfully. Typically, solar cells are fabricated with dedicated
silicon (Si) or gallium arsenide (GaAs) processes, optimized for efficiency and distinctly different
from commercial CMOS. Integrating solar power with digital circuitry has not been of interest
until recently. The first Smart Dust prototype was implemented as an MCM and attached to an
external battery [13]; a later version used MCM integration to incorporate solar cells [14], and finally
a monolithic solution was demonstrated using a custom silicon-on-insulator (SOI) process [31].
Although SOI is growing in popularity, it is not yet widely available.

Truly  monolithic  self-powered  devices  in  CMOS  have  been  proposed.  Three  such  examples  are  a 
sensor  network  processor  [32],  an  implantable  device  to  cure  human  blindness  [33],  and  other 
basic  research  [34].  Unfortunately,  none  of  these  efforts  reported  any  success  in  hardware, 
including  efficiency  results.  In  private  correspondence  with  Blaauw  [32],  it  was  revealed  that  their 
efficiency  was  less  than  1%.  Castaner  explains  that  the  CMOS  process  imposes  some  restrictions 
that  drastically  reduce  the  efficiency  of  solar  cells.  His  approach  is  similar  to  other  efforts,  using 
advanced  packaging  techniques  to  create  self-powered  SiP  designs  [35].  Obviously,  with  a 
maximum  efficiency  of  1%,  commercial  CMOS  presents  a  challenge.  A  novel  solar  cell  design  is 
presented  later  in  this  report. 

A monolithically integrated chemical fuel cell has been demonstrated with an operating time of
170 hours and mean open-circuit voltage of 0.533 V [36]. Unfortunately, it relies on an
oxygen-rich atmosphere, which is not suitable for space, but will work terrestrially. In addition, no
under-load performance data are presented. Other micro chemical power supplies, such as thin-film
batteries [37], nuclear batteries, and microturbines, have been investigated, but none can be
monolithically integrated.

Mechanical  energy  is  typically  converted  by  electromechanical  generators,  but  piezoelectric 
power  generation  is  also  possible.  Work  is  underway  in  piezoelectric  micro  power  sources,  but 
not  yet  for  SoC  [38].  Another  promising  source  of  integrated  electrical  power  is  through  inductive 
energy  transfer.  This  has  been  shown  in  a  monolithic  SoC  for  medical  implants  [39]. 

2.3.4  Data  Handling,  Processing,  and  Storage 

The data handling (DH) subsystem provides a range of on-board computing services. It receives, validates,
decodes,  and  distributes  commands  from  the  ground,  payload,  or  a  subsystem  to  other  spacecraft 
subsystems.  It  also  gathers,  processes,  and  formats  spacecraft  housekeeping  and  mission  data  for 
downlink  or  use  on  board.  DH  subsystems  are  usually  the  most  difficult  to  define  early  in  the 
design  due  to  the  initially  vague  requirements  of  the  payload  and  subsystems. 

At  a  minimum,  the  DH  subsystem  is  composed  of  a  central  processing  unit  (CPU)  and  supporting 
memory  elements.  The  difficult  part  of  the  design  is  the  hardware  interface  to  the  other  systems, 
typically  using  a  digital  data  bus  and  analog-to-digital  converters  (ADC).  For  monolithic  sensor 
nodes, a minimal reduced instruction set computer (RISC) CPU design is all that can be supported by the
available  power.  An  on-chip  ring  oscillator  with  selectable  frequency  output  and  power  up  reset 
can  be  used  to  run  the  CPU.  Some  introductory  thought  has  already  been  given  to  miniaturizing 
flight  computer  components  to  a  single  chip,  reflecting  a  growing  trend  in  SoC  development  [40]. 

One  issue  that  plagues  data  handling  systems  operating  in  space  is  the  extreme  radiation  and 
thermal  environment,  especially  considering  that  the  proposed  system  architecture  is  a  bare  die  in 
space  with  no  shielding.  Additionally,  low  power  operation  is  essential,  considering  the  small 
surface  area  for  integrated  solar  cells  as  discussed.  A  unique  solution  for  this  issue  is  the  second 
focus  of  this  research  effort  and  is  presented  in  the  next  section.  It  combines  two  atypical  design 
techniques  to  enable  low-power  operation  in  hostile  environments. 

2.3.5  Wireless  Communications  with  Other  Nodes 

The  original  Smart  Dust  design  presented  in  [13]  used  optical  communications  to  take  advantage 
of  its  power  efficiency.  Optical  links  are  also  free  of  regulatory  issues  and  can  use  simple  on/off 
keying  (OOK)  modulation  schemes.  This  approach  is  only  effective  in  line-of-sight  situations 
where the alignment is controlled. For sensor networks within a larger spacecraft, line of sight
would  be  difficult.  For  free-flying  nodes,  the  alignment  problem  becomes  the  predominant  issue. 
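
As a minimal illustration of why on/off keying is considered simple, the sketch below maps each bit to the presence or absence of a signal and recovers the bits by thresholding the average level over each bit period. The function names, the four samples per bit, and the 0.5 threshold are illustrative assumptions; they are not taken from the Smart Dust design in [13], and the sketch ignores noise, timing recovery, and alignment.

```python
def ook_modulate(bits, samples_per_bit=4, amplitude=1.0):
    """Map each bit to a run of samples: full amplitude for 1, zero for 0."""
    signal = []
    for bit in bits:
        level = amplitude if bit else 0.0
        signal.extend([level] * samples_per_bit)
    return signal


def ook_demodulate(signal, samples_per_bit=4, threshold=0.5):
    """Recover bits by comparing the mean level of each bit period to a threshold."""
    bits = []
    for start in range(0, len(signal), samples_per_bit):
        chunk = signal[start:start + samples_per_bit]
        mean_level = sum(chunk) / len(chunk)
        bits.append(1 if mean_level > threshold else 0)
    return bits


message = [1, 0, 1, 1, 0]
received = ook_demodulate(ook_modulate(message))
print(received == message)
```

In an optical link the "on" level would correspond to the light source (or reflector) being active, so the transmitter needs no carrier synthesis, which is consistent with the power-efficiency argument in the text.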

Low-power on-chip transceivers have become the preferred choice for sensor nodes. SoC
transceivers, which were a novelty only a few years ago, are now commercially available, some
even with an integrated microcontroller [41]. The commercial availability of RF CMOS and SiGe
BiCMOS processes has offered increased capabilities, including a wider selection of operating
frequencies. SoC transceivers still require external passive elements, crystal oscillators, and an
antenna. In an effort to eliminate external antennas, on-chip antennas have been investigated. The
maximum range achieved is approximately five meters, as demonstrated by Lin [42] and O [43].
Due to a 20 x 20 mm reticle size, most experiments use frequencies over 3.75 GHz, which gives a
quarter-wavelength antenna size of 20 mm or smaller. On-chip antennas for the 900 MHz and 2.4
GHz Industrial, Scientific, and Medical (ISM) bands are not feasible, as their quarter-wavelengths
are 8.3 cm and 3.1 cm respectively. The 5.8 GHz ISM band fits at 1.3 cm. Unfortunately, higher
frequencies require more power than lower frequencies for the same desired range.
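
The antenna-size argument can be reproduced with the free-space relation quarter-wavelength = c / (4 f). The sketch below is an illustration under stated assumptions: it uses the free-space speed of light and ignores the dielectric loading of the silicon substrate, which would shorten a real on-chip antenna; the function and constant names are hypothetical.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0   # free-space speed of light
RETICLE_SIDE_MM = 20.0               # reticle side length quoted in the text


def quarter_wavelength_mm(freq_hz: float) -> float:
    """Free-space quarter-wavelength in millimetres for a given frequency."""
    wavelength_m = SPEED_OF_LIGHT_M_S / freq_hz
    return wavelength_m / 4.0 * 1000.0


bands = [("900 MHz", 900e6), ("2.4 GHz", 2.4e9), ("3.75 GHz", 3.75e9), ("5.8 GHz", 5.8e9)]
for label, freq_hz in bands:
    length_mm = quarter_wavelength_mm(freq_hz)
    verdict = "fits" if length_mm <= RETICLE_SIDE_MM else "does not fit"
    print(f"{label}: {length_mm:.1f} mm, {verdict} within a {RETICLE_SIDE_MM:.0f} mm die edge")
```

Under these assumptions the quarter-wavelength is roughly 83 mm at 900 MHz, 31 mm at 2.4 GHz, 20 mm at 3.75 GHz, and 13 mm at 5.8 GHz, so only the last two fit along a 20 mm die edge.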

Another  technology  related  to  wireless  sensor  networks  is  RFID.  The  basic  concept  was  explained 
in  1948  and  arguably  was  envisaged  before  this  time  [44].  This  technology  was  not  used  much 
until  the  1970s,  when  it  saw  some  widespread  use  in  automated  vehicle  identification  for  various 
purposes,  such  as  road  tolls.  Technology  has  allowed  miniaturization  to  the  point  where  RFID 
“tags”  can  be  made  monolithically,  including  an  antenna,  with  a  range  of  a  few  meters  [45]. 

2.3.6  Environmental  Operability  and  Survivability 

Emerging wireless sensor network applications for hostile environments have prompted an
investigation into survivable sensor node design techniques, which currently do not exist. The
following five environmental hazard categories are introduced and discussed further.

(1)  Mechanical  (shock,  vibration,  acceleration) 

(2)  Atmospheric  (corrosion,  debris,  vacuum) 

(3)  Thermal  (extremes,  limited  heat  transfer) 

(4)  Energetic  (radiation,  including  charged  particles) 

(5)  Dynamic  (free-fall  orbit,  high  velocity  mobility,  attitude  disturbance  torques) 

Mechanical  (shock,  vibration,  acceleration) — Fragile  MEMS  structures  are  not  suitable  for 
applications  where  excessive  shock,  vibration,  and/or  acceleration  may  exist.  These  hazards  are 
seen  in  the  space  launch  segment  and  industrial  process  plants.  The  mechanical  rigidity  of  a 
monolithic  SoC  is  far  superior  in  this  case. 

Atmospheric (corrosion, debris, vacuum) — Corrosion is an issue for low-Earth orbit (LEO),
industrial/chemical,  and  biomedical  applications.  Any  exposed  aluminum  on  a  SoC  must  be 
covered,  either  by  gold  plating  or  by  passivation.  Space  debris  is  normally  considered  a  hazard  for 
satellites,  but  for  a  mission  where  thousands  of  objects  are  put  in  space,  they  become  a  big 
concern  to  other  systems.  The  only  realistic  way  to  solve  this  problem  is  to  confine  these  missions 
to  LEO,  where  the  orbital  lifetime  is  very  short,  essentially  making  these  missions  disposable. 
Finally,  the  vacuum  of  space  introduces  several  issues,  such  as  cold  welding  and  outgassing,  but 
for  SoC,  the  only  concern  is  limited  heat  transfer. 

Thermal  (extremes,  limited  heat  transfer) — Thermal  extremes  and  cycling  are  exacerbated  in  a 
vacuum,  as  thermal  radiation  is  the  only  method  available  for  heat  transfer.  Silicon  wafer  thermal 
properties  are  well  understood  and  certain  packaging  materials  can  be  used  to  manage  the 
temperature  extremes  for  a  SoC.  For  example,  space-qualified  paraffin  can  be  used  to  absorb  heat 
during  the  sunlit  portion  of  an  orbit,  and  then  keep  the  system  warm  during  eclipse,  effectively 
narrowing  the  temperature  range  the  SoC  will  experience. 

Energetic  (radiation,  including  charged  particles) — Extreme  radiation  conditions  are  usually 
experienced  in  nuclear  power  plants,  certain  industrial  process  plants,  and  in  space.  Ionizing 
radiation  causes  gradual  system  degradation  as  the  dose  accumulates.  In  addition,  high-energy 
particles,  such  as  electrons,  protons,  and  heavy  ions,  can  cause  single  event  phenomena.  This 
environment  is  discussed  in  detail  in  Chapter  4. 

Dynamic  (free-fall  orbit,  high  velocity  mobility,  attitude  disturbance  torques) — Terrestrial  sensor 
networks  are  composed  of  relatively  fixed  nodes.  In  contrast,  orbital  velocity  in  low  Earth  orbit 
(LEO)  is  approximately  7.5  km/s.  Natural,  but  undesirable  perturbations  change  the  orbit  over 
time, altering the arrangement of nodes, which is called a constellation. This factor must be fully
understood,  so  key  parameters  like  communication  range  can  be  selected  properly.  The  freefall 
environment  also  presents  unique  challenges.  The  dominant  effect  is  that  objects  in  orbit  “float” 
and  change  their  orientation  or  “attitude”  based  on  perturbations  from  solar  pressure,  gravity 
gradients,  magnetic  fields,  and  aerodynamic  drag.  This  may  not  be  an  issue  if  the  sensor 
technology  does  not  have  pointing  requirements.  However,  if  attitude  control  is  required,  SoC 
solutions  are  very  challenging  at  this  scale. 
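
The quoted LEO velocity follows from the circular-orbit relation v = sqrt(mu / r), where r is the distance from the Earth's center. The sketch below assumes a circular orbit and uses standard values for the Earth's gravitational parameter and equatorial radius; the 400 km and 700 km altitudes are illustrative choices, not values from the report.

```python
import math

MU_EARTH_KM3_S2 = 398_600.4418   # Earth's gravitational parameter, km^3/s^2
EARTH_RADIUS_KM = 6378.137       # Earth's equatorial radius, km


def circular_orbit_speed_km_s(altitude_km: float) -> float:
    """Speed of a circular orbit at the given altitude above the equatorial radius."""
    orbit_radius_km = EARTH_RADIUS_KM + altitude_km
    return math.sqrt(MU_EARTH_KM3_S2 / orbit_radius_km)


for altitude_km in (400.0, 700.0):
    speed = circular_orbit_speed_km_s(altitude_km)
    print(f"{altitude_km:.0f} km altitude: {speed:.2f} km/s")
```

Both altitudes give speeds close to the 7.5 km/s figure in the text, which is why relative motion between nodes, rather than absolute speed, sets the communication-range requirement.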

2.4  Analysis 

The  ultimate  SoC  vision  for  any  application  is  a  stand-alone  product  that  can  be  used  directly  off 
the  CMOS  process  line  without  any  additional  components,  packaging,  or  interfaces.  A  survivable 
SoC  has  additional  features  and  functional  requirements  as  just  outlined.  Based  on  our  experience 
with very small satellite design, we have identified the following research areas that are worth
pursuing further in order to realize a true SoC implementation of a survivable wireless sensor
node: 

(1)  Sensors 

(2)  Power  generation  and  storage 

(3)  Asynchronous  system  architecture 

(4)  Transceivers  and  antennas 

(5)  Attitude  control 

(6)  Location  and  time  determination 

(7)  Propulsion 

(8)  Environmental  extremes  tolerance 

Our  aim  is  to  not  only  help  achieve  the  vision  of  a  stand-alone  SoC,  but  to  design  a  system  that 
can  withstand  hostile  environments,  particularly  those  encountered  in  space  missions.  A  notional 
system configuration is illustrated in Figure 4.

Figure 4. Notional System Configuration

In  the  above  context  the  focus  of  this  research  is  on  developing  and  demonstrating  two  key 
capabilities: 

Power Generation via Solar Cells: Numerous integrated power sources have been studied, but all
have remained elusive for commercial CMOS. Integrated solar cells seem to be the most relevant
approach. Of the few published attempts, only one has achieved an estimated and unverified
efficiency of one percent. If successful, this development could be applied to a rapidly growing
number of standalone SoC applications. Progress in this area is reported in Chapter 3.

Radiation Hardening by Design of Asynchronous Logic: Asynchronous design for hostile
environments  through  the  application  of  radiation  hardening  by  design  has  only  been  briefly 
considered  previously.  Most  efforts  focus  on  implementing  redundant  logic  to  overcome  upsets 
from  radiation  sources.  The  challenge  is  that  the  asynchronous  power  efficiency  gain  is  partially 
offset  by  power  and  area  hungry  design  hardening  techniques.  A  case  study  on  this  issue  is 
presented in Chapter 4.

3 SiGe BiCMOS Solar Cells


As  discussed  in  the  previous  chapter,  solar  cells  are  typically  fabricated  with  dedicated  silicon  or 
gallium  arsenide  processes  optimized  for  efficiency,  then  strung  together  externally  with  the 
appropriate  series  and  parallel  connections  to  achieve  the  desired  voltage  and  current  output. 
Regarding monolithically integrated cells, CMOS does not provide the insulating features of SOI,
which facilitates series connections but is not widely available. Consequently, monolithic CMOS
solar cell research is limited to a few attempts with unreported results [32]-[34], apart from an
estimated efficiency of 1% [32]. A novel approach to monolithic solar cell design in CMOS is
presented  here,  which  aims  to  overcome  the  limitations  of  previous  implementations.  This 
technology  development  can  be  applied  to  a  rapidly  growing  number  of  SoC  applications. 
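
The series and parallel stringing described above follows simple rules: cell voltages add in series and cell currents add in parallel. The sketch below assumes identical, ideally matched cells; the per-cell voltage and current are placeholder values chosen for illustration, and the 3.3 V target is the process operating voltage mentioned later in this chapter.

```python
import math


def array_output(cell_voltage_v, cell_current_a, n_series, n_parallel):
    """Ideal output of n_parallel strings, each with n_series matched cells."""
    return cell_voltage_v * n_series, cell_current_a * n_parallel


def cells_in_series_for(target_voltage_v, cell_voltage_v):
    """Smallest number of series cells that reaches the target voltage."""
    return math.ceil(target_voltage_v / cell_voltage_v)


if __name__ == "__main__":
    cell_v = 0.5     # assumed per-cell voltage (V), illustrative only
    cell_i = 1.0e-3  # assumed per-cell current (A), illustrative only
    n_series = cells_in_series_for(3.3, cell_v)
    volts, amps = array_output(cell_v, cell_i, n_series, n_parallel=6)
    print(f"{n_series} cells in series, 6 strings: {volts:.1f} V, {amps * 1e3:.1f} mA")
```

Real cells are never perfectly matched, so a practical string is limited by its weakest cell; the sketch ignores that effect.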


3.1  Basic  Solar  Cell  Theory  of  Operation 

Solar  or  photovoltaic  cells  are  devices  that  convert  light  energy  or  photons  into  electric  current. 
Although modern solar cells are derived from semiconductor technology made popular by the
invention of the transistor in 1947, crude photovoltaic cells were in use before 1900. The
basis of a modern photovoltaic cell is the p-n junction of a crystalline semiconductor material,
such as germanium (Ge), silicon (Si), gallium arsenide (GaAs), or numerous other compounds.

In silicon, for example, the p and n regions are created by introducing dopant materials, such as
boron (B) or phosphorus (P), respectively. Boron has one less valence electron than silicon, so
its introduction in the crystal lattice creates an absence of an electron, called a hole (+). Similarly,
phosphorus has one more valence electron than silicon, creating an excess electron (-). The p-n
junction is created from a single crystal. Under normal conditions, excess holes from the p-type
material migrate to the n-type material while excess electrons in the n-type material migrate to the
p-type material, where electron-hole recombination takes place until equilibrium is reached [47].
Under illumination, most of the photon energy is absorbed at the surface of the material, creating
excess electron-hole pairs and reversing this migration, as illustrated in Figure 5.


Figure 5. Illuminated p-n Junction Photovoltaic Effect

To  harness  the  photovoltaic  energy,  an  ohmic  contact  is  placed  on  either  side  of  the  p-n  junction 
as  shown  in  Figure  6.  The  left  side  of  the  figure  indicates  the  accepted  voltage  polarity 
convention,  where  the  ground  (-)  probe  of  the  voltmeter  is  placed  on  the  n-type  material  and  the 
positive (+) probe is placed on the p-type material. Under illumination, the open circuit voltage is
positive. On the right side of the figure, the short circuit current is illustrated, where the current
flow  is  positive,  indicating  the  flow  of  holes  in  the  direction  shown. 
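
The open circuit voltage and short circuit current described above are the two end points of the cell's current-voltage curve. A minimal sketch using the textbook ideal-diode model of an illuminated junction is given below; this model is a common simplification and is not taken from the report, and the light-generated and saturation currents in the usage example are hypothetical values.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23
ELECTRON_CHARGE_C = 1.602176634e-19


def thermal_voltage(temp_k=300.0):
    """kT/q in volts."""
    return BOLTZMANN_J_PER_K * temp_k / ELECTRON_CHARGE_C


def cell_current(v, i_light, i_sat, temp_k=300.0, ideality=1.0):
    """Terminal current of an ideal illuminated p-n junction at voltage v."""
    vt = ideality * thermal_voltage(temp_k)
    return i_light - i_sat * (math.exp(v / vt) - 1.0)


def open_circuit_voltage(i_light, i_sat, temp_k=300.0, ideality=1.0):
    """Voltage at which the terminal current falls to zero."""
    vt = ideality * thermal_voltage(temp_k)
    return vt * math.log(i_light / i_sat + 1.0)


if __name__ == "__main__":
    i_light, i_sat = 1.0e-3, 1.0e-12  # hypothetical currents (A)
    v_oc = open_circuit_voltage(i_light, i_sat)
    print(f"I_sc = {cell_current(0.0, i_light, i_sat) * 1e3:.3f} mA")
    print(f"V_oc = {v_oc:.3f} V")
```

With the polarity convention of Figure 6, both quantities come out positive, matching the description in the text.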




3.2  Integrated  SiGe  BiCMOS  Solar  Cell  Design 

The 0.35 µm SiGe BiCMOS (S35) process from austriamicrosystems (AMS) is used throughout
this  work  due  to  its  cost  effectiveness,  lack  of  light-blocking  layers,  and  support  for  integrated 
radio  in  future  work.  However,  the  NPN  SiGe  bipolar  junction  transistor  (BJT)  structure  is  the 
primary  reason  for  selecting  this  technology,  as  it  provides  a  semi-isolated  p-n  junction  at  the 
surface.  Bulk  CMOS  only  supports  an  n-well  based  n-p  junction,  which  cannot  be  isolated  for 
series  connections  and  produces  a  negative  voltage  with  respect  to  ground,  which  is  the  p-type 
wafer.  This  experimental  photocell  design  is  investigated  and  reported  on.  Not  every  detail  of  the 
AMS  process  is  given  due  to  the  academic  non-disclosure  agreement  in  force. 

The  novel  photocell  design  utilizes  NPN  SiGe  large  area  transistors,  which  are  thin  and  close  to 
the  surface.  The  standard  NPN  SiGe  BJT  structure  is  modified  to  maximize  the  collector-base 
(CB)  interface  and  minimize  the  emitter  (E)  contact  area,  which  is  left  floating.  A  conceptual  side 
view drawing (not to scale) is shown in Figure 7.

Figure 7. Photocell Design Concept (Side View)

A  closer  inspection  of  Figure  7  reveals  the  essential  physical  elements  of  the  design.  Starting  from 
the bottom, the AMS S35 technology uses a p-type substrate, common to most commercial
CMOS processes. To create the collector (C), an n+ sinker and buried layer are required to contact
the  buried  n-well.  On  top  of  the  collector,  the  base  (B)  is  formed  of  a  thin  p-type  SiGe  layer, 
where  polysilicon  (not  shown)  is  used  to  make  the  base  contact.  Field  oxide  (fox)  insulates  the 
base  from  the  surrounding  elements.  The  emitter  (E)  is  a  small  amount  of  n-type  material 
connected  by  polysilicon  (not  shown)  to  create  the  complete  NPN  structure.  The  emitter  is  left 
floating  and  is  kept  as  small  as  possible  to  maximize  incident  light  while  satisfying  the  process 
design rules. Finally, the polysilicon (poly1) through metal layer four (met4) are shown to
illustrate  that  regular  placement  of  these  layers  is  required  to  satisfy  the  coverage  and  slotting 
rules  of  the  process.  Unfortunately,  these  layers  reduce  the  overall  efficiency. 

The  advantageous  placement  of  field  oxide  (fox)  in  the  NPN  design  is  what  makes  series 
connections  possible  in  SiGe  BiCMOS  and  not  bulk  CMOS.  Making  the  series  and  parallel  cell 
connections  is  straightforward  with  this  single-cell  design.  As  shown  in  Figure  7,  these  cells  are 
arranged  for  a  series  connection,  raising  the  voltage  at  each  increment.  The  base  (B)  of  one  cell  is 
connected  to  the  neighboring  collector  (C)  through  vias  to  the  metal  layers  above  (not  shown). 
Looking at the cell design from the top, Figure 8 illustrates how the field oxide completely
isolates the p-type SiGe base (B) from the adjacent material. However, this design is not as
efficient as a similar one in SOI, as there is no insulating layer available between the bottom n+
buried layer and the p-substrate as shown in Figure 7. Figure 9 illustrates the physical layout in
the Cadence computer aided design (CAD) package, mirroring the view in Figure 8.

Figure 8. Photocell Design Concept (Top View)


Figure  9.  Photocell  Design  Concept  (Cadence  Layout  View) 




Figure  10.  Photocell  Design  Concept  (Schematic  View) 

Figure 10 is a hybrid view of the layout and schematic. It is essential to understand that while
most light is absorbed at the top layer, some penetrates into the material and activates the lower
n-p junction at the substrate, as well as the n-p junction of the sidewalls. All electron-hole
migrations are illustrated, giving the desired positive bias with respect to the substrate.

Figure 11 is the final layout in Cadence of the first test chip from run 1550 and a micrograph of an
unpackaged die after fabrication. The die is 1420x1420 µm (1.42x1.42 mm), which is the
minimum-sized test chip in the AMS S35 process. Two more experimental designs are reported
on in the next section, with a final design still in fabrication at the time of writing.


Figure  11.  Cadence  Layout  and  Die  Micrograph  of  Solar  Cell  Test  Chip  #1 


3.3  Integrated  SiGe  BiCMOS  Solar  Cell  Test  Results 

Results from three experimental photocell designs are reported. The first design and test chip is
shown in Figure 11. The schematic of the design is similar to that in Figure 10; however, the base
(instead of the emitter) is floating on each cell, based on a photocell design given in [21].
Secondly, there are six banks of photocells in parallel, clearly seen in Figure 11, three on the top
and three on the bottom, with a large channel in between the sets, and smaller channels within the
sets of three. The various channel spacings are to evaluate the isolation qualities. Additionally, the
six banks of photocells have all collectors (left) and emitters (right) connected to the adjacent test
pads. This allows for selectable external series connections of the cells. The pads at the top and
bottom are for transistor test structures and are discussed in section 4.4.

Test  chip  results  reveal  that  the  NPN  CB  junction  is  not  activated  as  expected.  Upon  closer 
investigation,  the  reference  photocell  design  [21]  is  not  appropriate  for  this  application,  as  the  BE 
interface  acts  as  a  diode,  preventing  current  from  flowing  through  this  interface.  However,  the  test 
chip  allows  examination  of  the  underlying  n-well  to  p-substrate  junction  and  its  performance, 
which has some value, as efficiency results from this straightforward approach are not reported in
the literature. As expected, this junction has a negative bias with respect to the substrate, which
prevents direct application of the power from the cells.

Solar cells from AMS S35 run 1550 test chips are subjected to AM0 solar conditions per ASTM
E-490 (1366.1 W/m²). Summary current and power measurements are presented in Figure 12. The
average efficiency is an encouraging 2.4%, vs. 1% from previous work [32]. The actual efficiency
of the interface is 8.3%, without considering the metallization overhead.

Figure 12. Solar cell current vs. voltage, AM0, test chip 1550

As the cause of the unexpected results was not immediately discovered, further examination of
n-well based photocells took place. To potentially improve efficiency, the SiGe layer was removed
to allow more light to penetrate down to the lower n-well junction. The improved cells were
included with other work on run 1791, discussed in the next section. They demonstrate 3.44%
efficiency as shown in Figure 13, which is roughly a 40% improvement over the first attempt. The
interface efficiency alone is 11.3% without considering the metallization overhead. Additionally,
these experiments confirmed that solar cells can be integrated with CMOS logic with
no adverse effects. This is verified in the hardware testing discussed in the next chapter.

Figure 13. Solar cell current vs. voltage, AM0, test chip 1791
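
The efficiency figures quoted for the two test chips follow the usual definition: electrical output power divided by incident optical power. The sketch below assumes the AM0 irradiance of 1366.1 W/m² cited in the text. The cell area and output power in the usage example are hypothetical values chosen only to show the arithmetic. Reading the ratio of overall to interface efficiency as the unshaded fraction of the cell is an interpretation, not a measured figure from the report.

```python
AM0_IRRADIANCE_W_M2 = 1366.1  # ASTM E-490 value quoted in the text


def efficiency(p_out_w, area_m2, irradiance_w_m2=AM0_IRRADIANCE_W_M2):
    """Electrical output power divided by incident optical power."""
    return p_out_w / (irradiance_w_m2 * area_m2)


def relative_improvement(new_value, old_value):
    """Fractional gain of new_value over old_value."""
    return (new_value - old_value) / old_value


def implied_active_fraction(overall_eff, interface_eff):
    """Ratio of overall to interface efficiency (assumed unshaded fraction)."""
    return overall_eff / interface_eff


if __name__ == "__main__":
    # Hypothetical 1 mm^2 cell delivering 32.8 microwatts under AM0.
    print(f"efficiency: {efficiency(32.8e-6, 1.0e-6):.3%}")
    print(f"run 1791 vs. run 1550: {relative_improvement(3.44, 2.4):.1%}")
    print(f"run 1550 active fraction: {implied_active_fraction(2.4, 8.3):.2f}")
    print(f"run 1791 active fraction: {implied_active_fraction(3.44, 11.3):.2f}")
```

Under this reading, roughly 30% of the cell area contributes on both chips, which is consistent with the statement that the coverage and slotting layers reduce overall efficiency.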

Until the cause of the unexpected results on the first test chip was discovered, an on-chip charge
pump was considered. Recently, a conventional single-chip (without external passives) charge
pump design has been reported for energy harvesting applications using external solar cells [48].
Charge pumps are commonly used on flash memory devices to provide the required higher
voltage for write operations. Not only can they boost the voltage considerably above the power
supply levels, they can also invert the bias. This is particularly interesting in the case of n-well based
photocells, as this would allow for both bias inversion and boosting to minimum operating levels.
Unfortunately, charge pumps will not work in this situation, as the minimum bias achieved is
0.5 V, which is half of the required charge pump start-up voltage of 1.0 V. Now that the original
SiGe BiCMOS design is corrected, a final test chip has been submitted on run 1875, which will
provide a positive bias at the process operating voltage of 3.3 V. This is the best solution overall,
as the associated inefficiency of charge pumps is avoided.

4 Radiation Hardening by Design Asynchronous Logic


A  case  study  supporting  the  development  of  environmentally  tolerant  logic  designs  is  presented. 
The  synergy  of  a  unique  asynchronous/hardened  design  approach  improves  the  tolerance  to 
radiation,  semiconductor  processing  variations,  voltage  fluctuations,  and  temperature  extremes. 
Radiation  hardening  by  design  (RHBD)  has  been  recognized  for  over  a  decade  as  an  alternative 
open-source  circuit  design  approach  to  mitigate  a  spectrum  of  high-energy  radiation  effects,  but 
has  significant  power  and  area  penalties.  Similarly,  asynchronous  logic  design  offers  potential 
power  savings  and  performance  improvements,  with  a  tradeoff  in  design  complexity  and  a  lesser 
area  penalty.  These  side  effects  have  prevented  wider  acceptance  of  both  design  approaches. 

4.1  Radiation  Hardened  by  Design  Background 

Extreme  radiation  conditions  are  usually  experienced  in  nuclear  power  plants,  some  industrial 
process  plants,  and  in  space.  Surprisingly,  in  the  early  days  of  IC  development,  alpha  particles 
from  impurities  in  plastic  packaging  caused  mysterious  anomalies  in  terrestrial  systems.  Neutrons 
occasionally  cause  errors  in  airplane  avionics  systems  flying  at  normal  cruising  altitudes  [49]. 
Space  and  various  nuclear  environments  are  more  challenging,  where  the  total  ionizing  dose 
(TID)  of  radiation  causes  gradual  system  degradation,  resulting  in  an  increase  in  power 
consumption.  In  addition,  high-energy  particles,  such  as  electrons,  protons,  and  heavy 
ions/galactic cosmic rays (GCRs), can cause single event effects (SEE), predominantly upset
(SEU), latchup (SEL), and recently, transient (SET). Unnatural effects, such as enhanced dose
rate,  prompt  neutron  dose,  and  system  electromagnetic  pulse  (System  EMP)  are  not  discussed,  as 
they  are  only  concerns  for  hardened  military  systems. 

Mitigating  these  effects  has  historically  been  accomplished  with  a  system-level  approach,  which 
can  become  quite  expensive.  Heavy  shielding  of  various  types  can  be  used  to  reduce  TID  and 
System  EMP,  but  is  ineffective  against  SEE.  SEE  are  tolerated  and  detected,  typically  through 
triple  (or  more)  modular  redundancy  (TMR)  or  voting  schemes.  At  the  IC  level,  dedicated 
semiconductor  foundries  for  military  purposes  only  are  used  to  produce  hardened  components. 
These  hardened  foundries  are  typically  several  generations  behind  their  commercial  counterparts. 
One  open  source  radiation-hardening  solution  at  the  IC  level  is  the  application  of  RHBD,  which 
can be used on any generation process, including the most recent [50]. The guiding principle
behind RHBD is to mitigate as many of the radiation effects as possible by using unconventional
layout  techniques  at  the  transistor  device  and  circuit  level. 
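
The triple modular redundancy (TMR) voting mentioned above as a system-level mitigation can be sketched as a bitwise two-of-three majority function. This is a generic illustration of the technique, not the voting scheme of any particular system discussed in the report; the function names are chosen here for illustration.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant copies of a word."""
    return (a & b) | (a & c) | (b & c)


def vote_with_detection(a: int, b: int, c: int):
    """Return the voted word and a flag showing whether any copy disagreed."""
    return majority_vote(a, b, c), not (a == b == c)


if __name__ == "__main__":
    good = 0b1011_0010
    upset = good ^ 0b0000_1000  # one copy with a single flipped bit
    voted, mismatch = vote_with_detection(good, upset, good)
    print(f"voted = {voted:#010b}, mismatch detected = {mismatch}")
```

A single upset copy is outvoted and flagged, which is the "tolerated and detected" behavior described in the text. Two simultaneous upsets in the same bit position would defeat the voter.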

Beginning  with  TID,  the  degradation  mechanisms  must  first  be  understood  before  they  can  be 
mitigated.  CMOS  circuits  slowly  degrade  due  to  the  total  accumulated  dose  of  ionizing  radiation. 
This  degradation  is  seen  as  a  negative  shift  in  the  transistor  threshold  voltage  and  decrease  in  gain. 
With  enough  voltage  threshold  shift,  the  circuit  will  start  consuming  power  even  when  not 
switching.  The  decrease  in  gain  causes  the  transistors  to  become  harder  to  switch.  After  extended 
exposure  to  radiation,  the  circuit  will  cease  to  function  [51]. 

The  main  source  of  degradation  comes  from  the  interaction  of  ionizing  radiation  with  the  gate  and 
field oxides (SiO2) in the device structure. The gate oxide is a thin high-quality oxide used to
insulate  the  gate  contact  from  the  transistor  channel.  The  field  oxide  is  a  thick  low-quality  oxide 
used  to  isolate  metal  traces  from  one  another  [49]. 

Ionizing  radiation  causes  the  formation  of  electron-hole  pairs  in  the  gate  oxide.  Electrons  have  a 
much higher mobility than holes in SiO2 and are attracted to and swept out of the gate in an nMOS
transistor.  The  holes  become  trapped  and  migrate  toward  the  transistor  channel.  This  results  in  the 
eventual  buildup  of  positive  charge  above  the  transistor  channel  and  acts  like  the  charge  that  is 
present  when  voltage  is  applied  at  the  gate.  As  more  charge  is  trapped,  the  voltage  threshold  of  the 
nMOS  transistor  becomes  more  negative,  which  means  it  becomes  easier  to  turn  on.  With  enough 
shift  in  threshold  voltage,  the  transistor  will  be  turned  on  without  a  gate  voltage  applied. 
Conversely,  a  pMOS  transistor  becomes  more  difficult  to  turn  on.  Figure  14  shows  how  the  gate 
voltage versus drain current curve of an nMOS transistor changes as a result of exposure to
radiation [49].
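
The effect of a negative threshold shift on an nMOS transistor can be illustrated with the textbook square-law model. This is a simplified sketch: it ignores subthreshold conduction and velocity saturation, and the transconductance parameter and threshold values are hypothetical, not measurements from [49].

```python
def nmos_drain_current(v_gs, v_th, k_a_per_v2=1.0e-4):
    """Saturation-region square-law drain current; zero below threshold."""
    overdrive = v_gs - v_th
    if overdrive <= 0.0:
        return 0.0
    return 0.5 * k_a_per_v2 * overdrive ** 2


if __name__ == "__main__":
    # Hypothetical threshold voltages before and after irradiation.
    for label, v_th in (("pre-rad", 0.5), ("post-rad", -0.2)):
        leakage = nmos_drain_current(0.0, v_th)
        print(f"{label}: V_th = {v_th:+.1f} V, I_d at V_gs = 0 V: {leakage * 1e6:.2f} uA")
```

Once the threshold passes through zero, the model conducts with no gate voltage applied, which is the static power increase described in the text.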


Figure 14. Total Ionizing Dose Effect on nMOS Threshold Shift [49]

The field oxide also traps charge due to ionizing radiation. The trapped positive charge along the
edges  of  the  nMOS  transistor  creates  a  leakage  channel.  Leakage  paths  can  also  form  between 
transistors  through  the  field  oxide.  This  constant  leakage  contributes  to  increased  power 
consumption  [49].  Figure  15  illustrates  how  a  circuit  exposed  to  a  radiation  environment  slowly 
increases  power  consumption  and  reduces  the  operating  frequency.  Eventually,  the  circuit  will 
cease  functioning  when  the  power  required  by  the  degraded  electronics  exceeds  the  output 
capability  of  the  power  supply.  Premature  failure  can  also  occur  when  the  output  voltage  swing  of 
the  transistors  becomes  insufficient  to  drive  successive  stages  or  when  the  timing  is  degraded  to 
the point where the circuit does not operate properly.

Figure 15. Total Ionizing Dose Response of Maximum Frequency and Supply Current [49]

When  a  high-energy  particle  passes  through  a  circuit  and  causes  a  disruption  in  circuit  operation, 
it  is  classified  as  a  single  event  effect  (SEE).  For  example,  a  proton  or  ion  passing  through  a  latch 
could  change  the  value  of  a  stored  bit.  This  event  is  called  a  single  event  upset  (SEU).  Protons  and 
high-energy  heavy  ions  typically  cause  SEUs.  Space  vehicles  passing  through  the  South  Atlantic 
anomaly,  where  there  is  a  high  concentration  of  protons,  can  experience  SEU  activity  in  that 
region.  These  particles  create  a  temporary  presence  of  an  abundance  of  free  carriers  in  the 
transistor  channel  region.  The  free  carriers  in  effect  turn  the  channel  on. 

If  a  channel  is  turned  on  in  a  combinational  logic  circuit,  the  effect  is  seen  as  a  glitch  in  a  data  or 
control  line,  which  normally  does  not  affect  system  operation  unless  the  glitch  occurs  during  a 
clock  transition.  However,  if  a  channel  is  turned  on  that  is  part  of  a  memory  structure,  such  as  a 
latch,  it  can  upset  the  state  of  the  latch.  Upset  can  only  occur  if  enough  carriers  are  present  in  the 
transistor  channel  to  turn  it  on  strongly  enough  to  change  the  state  of  the  latch.  SEU  can  be 
corrected by refreshing memory locations on a periodic basis.

Another effect seen in CMOS is single event latchup (SEL). SEL describes the phenomenon that
occurs when inactive parasitic transistor regions (pnpn structure) are turned on by a high-energy
particle.  These  pnpn  regions  are  formed  in  CMOS  layouts  due  to  the  close  placement  of  nMOS 
and  pMOS  transistors  and  have  the  characteristics  of  a  silicon  controlled  rectifier  (SCR).  If  a 
particle  with  enough  energy  passes  through  the  controlling  pn  junction  of  the  SCR,  it  can  switch 
the  SCR  on.  The  only  way  to  turn  the  SCR  off  is  with  a  power  cycle. 

Radiation  tolerance  to  total  ionizing  dose  and  single  event  effects  can  be  achieved  through  layout. 
A  radiation  tolerant  inverter  is  shown  in  Figure  16.  Total  ionizing  dose  effects  are  minimized  by 
the use of annular geometry nMOS transistors. This geometry minimizes the threshold voltage
shift by preventing the buildup of trapped charge near the active region and eliminates edge leakage.
The transistors are surrounded with highly doped guard rings, which prevent leakage through the
field oxide separating the transistors and nearly eliminate SEL. The inherent increased drive
strength  (width)  of  the  transistors,  due  to  meeting  minimum  design  rules  for  the  annular  nMOS 
then  balancing  with  pMOS,  increases  the  SEU  threshold  and  reduces  SET.  Additional  redundancy 
techniques  can  be  applied  where  higher  SEE  hardness  is  required. 


[Figure 16 labels: Ground, nMOS transistor, Input, pMOS transistors, Power, p+ Guard Ring, Output, n+ Guard Ring]

Figure 16. RHBD Layout of an Inverter and Key Features

Numerous RHBD efforts have demonstrated considerable radiation hardness. For example, a
recent design and test campaign in 0.25 µm CMOS achieved these results, which far exceed
envisioned  mission  requirements  [52]: 

•  TID  >  1  MRad(Si) 

•  SEL > 110 MeV-cm²/mg @ 125 °C (no latch-up)

•  SEU < 1×10⁻¹² errors/bit-day @ 2.25 V
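
To put the SEU figure in the list above into perspective, the per-bit rate can be scaled to a whole memory and mission duration. The sketch below assumes independent upsets (a Poisson model); the memory size and mission length are hypothetical values, and the rate is used as an upper bound because the source reports it as "less than".

```python
import math


def expected_upsets(rate_per_bit_day, n_bits, days):
    """Mean number of upsets for a memory of n_bits over the given duration."""
    return rate_per_bit_day * n_bits * days


def probability_of_any_upset(rate_per_bit_day, n_bits, days):
    """Probability of at least one upset, assuming independent (Poisson) events."""
    return -math.expm1(-expected_upsets(rate_per_bit_day, n_bits, days))


if __name__ == "__main__":
    rate = 1.0e-12        # errors/bit-day, upper bound quoted from [52]
    bits = 1024 * 1024    # hypothetical 1 Mbit memory
    days = 365            # hypothetical one-year mission
    print(f"expected upsets: {expected_upsets(rate, bits, days):.2e}")
    print(f"P(at least one): {probability_of_any_upset(rate, bits, days):.2e}")
```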

Despite  the  many  advantages  of  this  relatively  straightforward  approach  to  mitigating  radiation 
effects,  there  are  two  primary  drawbacks.  First,  the  basic  sea  of  gates  or  gate  array  approach  does 
not  lend  itself  to  compact  designs,  so  there  is  a  significant  area  penalty.  Also,  there  is  a  power 
penalty,  as  the  transistor  length  is  much  longer  than  the  minimum  size  for  the  technology. 


4.2  Radiation  Hardened  Library  Design 

A digital cell library is designed for the AMS S35 process (HITKIT 3.70) in the Cadence DFII
framework (2006-2007 5.1.41). A simplified overview of the development process is presented in
Table 5. It should be noted that each step involved a significant time investment due to the
required learning curve of the complex, yet powerful, commercial tools involved. The simplest
cell in the library, INV0, shown in Figure 16, clearly illustrates the design and
features of the nmos4 and pmos4 pcells discussed in steps two and three of Table 5. The complete
list of cells required for all designs is given in Table 6 and Table 7.


Table  5.  Radiation  Hardened  Library  Design  Development  Process 


Step  Tool                  Action
1     Library Manager       Copy CORELIB, GATES, IOLIB, and PRIMLIB to *_RHBD
2     Virtuoso (Pcell)      Create/compile nmos4 and pmos4 pcells in PRIMLIB_RHBD
3     CDF                   Edit descriptions of nmos4 and pmos4 in PRIMLIB_RHBD to match
4     Virtuoso (Schematic)  Verify/update width and length parameters in GATES_RHBD
5     Virtuoso (Schematic)  Design synthesis to Layout XL
6     Virtuoso (XL)         Manually place and route pcells, label terminals
7     Assura                Copy/edit extract.rul file to extract annular nmos properly
8     Assura (DRC)          Run design rule check, correct errors as needed
9     Assura (LVS)          Run layout versus schematic, ensure designs match
10    Assura (RCX)          Run parasitic extraction and verify av_extracted view
11    DFII (Export Stream)  Create gdsII files from layout view
12    Library Manager       Create functional (Verilog)
13    Abstract Generator    Complete abstract generation process for each cell
14    Virtuoso (Layout)     Manually convert nmos devices in IOLIB to equivalent annular
15    Voltage Storm*        Characterize and create timing libraries for Verilog and Encounter

Table 6. Radiation Hardened Library Core Cells


Cell        Description                      Standard Size (µm)  RHBD Size (µm)
AOI210      2-Input AND into 2-Input NOR     5.6x13              16.8x13
AOI220      2x2-Input AND into 2-Input NOR   7x13                22.4x13
AOI310      3-Input AND into 2-Input NOR     7x13                22.4x13
BUF2        Buffer                           4.2x13              11.2x13
DF1         D Flip Flop                      21x13               67.2x13
DFC1        D Flip Flop w/active low clear   23.8x13             78.4x13
DFP1        D Flip Flop w/active low preset  23.8x13             78.4x13
INV0        Inverter                         2.8x13              5.6x13
MUX21       2:1 Multiplexer                  8.4x13              33.6x13
NAND20      2-Input NAND                     4.2x13              11.2x13
NAND30      3-Input NAND                     5.6x13              16.8x13
NAND40      4-Input NAND                     7x13                22.4x13
NOR20       2-Input NOR                      4.2x13              11.2x13
NOR30       3-Input NOR                      5.6x13              16.8x13
NOR40       4-Input NOR                      7x13                22.4x13
OAI210      2-Input OR into 2-Input NAND     5.6x13              16.8x13
XOR20       2-Input XOR                      9.8x13              28x13
TIE0/1      Tie lo and hi logic              2.8x13              5.6x13
Fill cells  Fill cells for SOC Encounter     Various             Various
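
The area penalty of the hardened library can be read directly from Table 6. Because every cell shares the 13 µm row height, the area ratio reduces to the width ratio. The sketch below uses a subset of the table entries; the cell names follow the table as read here (for example INV0 and DF1), and the selection of cells is arbitrary.

```python
CELL_HEIGHT_UM = 13.0

# (standard width, RHBD width) in micrometres, copied from Table 6.
CELL_WIDTHS_UM = {
    "INV0": (2.8, 5.6),
    "NAND20": (4.2, 11.2),
    "NOR20": (4.2, 11.2),
    "XOR20": (9.8, 28.0),
    "MUX21": (8.4, 33.6),
    "DF1": (21.0, 67.2),
}


def area_ratio(cell_name):
    """RHBD cell area divided by standard cell area."""
    standard_w, rhbd_w = CELL_WIDTHS_UM[cell_name]
    return (rhbd_w * CELL_HEIGHT_UM) / (standard_w * CELL_HEIGHT_UM)


if __name__ == "__main__":
    for name in CELL_WIDTHS_UM:
        print(f"{name:7s} {area_ratio(name):.2f}x")
```

For these cells the penalty ranges from 2x for the inverter to 4x for the multiplexer, which quantifies the "significant area penalty" noted at the end of Section 4.1.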

Table 7. Radiation Hardened Library Input/Output Cells

Cell   Description              Standard Size (µm)  RHBD Size (µm)
BBC1P  1 mA bi-directional pad  95x334              same
BU1P   1 mA output buffer       95x334              same
ICP    Input buffer             95x334              same


4.2.1  Asynchronous  Logic  Background 

Traditional  synchronous  circuit  designs  feature  a  global  clock  that  drives  latches  surrounding 
combinational  logic,  which  as  a  system,  performs  a  particular  function.  The  clock  rate  is 
determined  by  the  critical  path  through  the  system.  This  approach  has  remained  an  industry 
standard  largely  due  to  the  entrenched  design  flow,  which  includes  design  synthesis  from 
hardware  description  languages  (HDLs).  However,  synchronous  designs  have  periodic  power 
peaks,  which  produce  EMI.  Additionally,  the  global  clock  tree  consumes  a  significant  fraction  of 
the  required  power. 

Asynchronous  SoC  architecture,  which  offers  numerous  advantages,  has  only  recently  been 
considered by this niche community [53]. Asynchronous implementations can
potentially require a fraction of the power of their clocked counterparts and produce very little
electromagnetic interference (EMI). Asynchronous designs are event triggered, processing new
data using the minimum number of gate transitions possible. Asynchronous SoC design also
promises to solve the global clock delay problem, which increases as the size of SoCs grows with
increased functionality and performance.

Asynchronous logic concepts have existed since the 1950s, offering potential power savings and
performance improvements depending on the application [54]. Analogous to RHBD’s shortfalls in
power  and  area  penalties,  asynchronous  logic  design  is  more  complex  when  compared  to  the 
synchronous  commercial  standard  and  carries  a  potential  area  penalty.  However,  recent  advances 
in  automating  the  asynchronous  design  process  have  made  the  idea  more  attractive,  resulting  in 
new  commercial  offerings. 

Asynchronous  designs  work  on  the  concept  of  modular  functional  blocks  with 
intercommunication  using  handshaking  protocols.  The  overall  function  of  the  circuit  resembles 
that  of  the  synchronous  one.  Recently,  considerable  progress  has  been  made  to  improve  the  design 
automation  of  this  particular  asynchronous  characteristic,  complete  with  a  new  term,  de- 
synchronization  [55]. 

However,  de-synchronization  does  not  yet  realize  all  the  potential  advantages  of  asynchronous 
logic.  Although  removing  the  global  clock  tree  and  replacing  it  with  a  fabric  of  handshaked 
interconnections  does  flatten  the  power  spectrum  and  reduce  EMI  generation,  it  is  generally 
accepted  that  the  opportunity  is  missed  to  significantly  lower  the  energy  requirements  and 
improve the performance. This can be achieved by recognizing that synchronous circuits
often have redundant operations depending on the system state and that not all operations take the
same amount of time. Unfortunately, automating this process has not been achieved due to the
variety of power and latency reduction techniques that can be applied, each of which is design
dependent.
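
The observation that not all operations take the same amount of time can be made concrete with a small comparison. In a clocked design every operation is allotted the worst-case latency, whereas a self-timed design pays each operation's actual latency plus a handshake overhead. The latencies and overhead below are hypothetical numbers chosen for illustration only.

```python
def synchronous_total_ns(latencies_ns):
    """Every operation occupies one clock period set by the slowest operation."""
    return max(latencies_ns) * len(latencies_ns)


def asynchronous_total_ns(latencies_ns, handshake_overhead_ns=0.5):
    """Each operation takes its own latency plus a fixed handshake cost."""
    return sum(latency + handshake_overhead_ns for latency in latencies_ns)


if __name__ == "__main__":
    ops_ns = [2.0, 3.0, 10.0, 2.0]  # hypothetical operation latencies
    print(f"synchronous : {synchronous_total_ns(ops_ns):.1f} ns")
    print(f"asynchronous: {asynchronous_total_ns(ops_ns):.1f} ns")
```

The advantage shrinks as latencies become more uniform or the handshake overhead grows, which is one reason the benefit is design dependent.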

A  custom  design  approach  was  chosen  for  this  work  to  demonstrate  the  best-possible  benefits  of 
asynchronous  logic,  leveraging  the  assumption  that  others  are  continuing  to  improve 
asynchronous  design  automation.  The  paragraphs  to  follow  describe  the  general  asynchronous 
design  methodologies  used  in  this  work.  The  next  section  discusses  the  integration  of  the  RHBD 
and  asynchronous  design  concepts  and  presents  the  comparative  results. 

The  asynchronous  building  blocks  used  in  this  effort  fall  into  four  typical  categories,  briefly 
reviewed  in  the  following  paragraphs  [56].  The  fundamental  mode  bounded  delay  methodology  is 
used  for  blocks  with  relatively  fixed  completion  times.  The  delay  insensitive  design  methodology 
applies  to  functional  blocks  with  widely  varying  completion  times.  Burst  mode  design 
methodology  applies  to  components  that  serve  as  controllers  or  asynchronous  finite  state 
machines  (AFSMs).  Finally,  the  speed  independent  model  specifies  the  handshaking  protocols 
between  major  functional  blocks. 

The fundamental mode bounded delay methodology was used for functional blocks that had little
variation in completion time, such as a latch. This methodology assumes that the delay through
a functional block is known and constant. The worst-case delay, with a margin of safety, is
used in much the same way as the clock period of a clocked circuit. Synthesizing this structure
is difficult because timing information cannot be synthesized from behavioral HDL, although it
can be back-annotated from layout simulations. Figure 17 illustrates a delay element used to
model the latch completion time. After the request (REQ) is generated, an acknowledge (ACK)
signal is asserted once the data is latched.


Figure  17.  Fundamental  Mode  Bounded  Delay  Applied  to  a  Latch 
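
The bounded-delay idea can be sketched behaviorally: ACK is REQ delayed by the worst-case latch time plus a margin. The Python below is only an illustration of that relationship; the function names and the delay and margin values are hypothetical and are not taken from this design.

```python
def ack_time(req_time_ns, worst_case_delay_ns, margin_ns):
    """Time at which ACK is asserted for a REQ arriving at req_time_ns.

    The delay element is sized for the worst-case latch delay plus a
    safety margin, so ACK never precedes the data being latched.
    """
    return req_time_ns + worst_case_delay_ns + margin_ns


def is_safe(actual_latch_delay_ns, worst_case_delay_ns, margin_ns):
    """True if the latch has settled by the time ACK is asserted."""
    return actual_latch_delay_ns <= worst_case_delay_ns + margin_ns
```

A latch that is slower than the modeled bound would make `is_safe` return False, which is the failure case the safety margin is meant to prevent.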

A fixed delay element is not suitable for functional blocks with widely varying completion
times, since their average latency can be much lower than the worst-case critical path delay
that a synchronous counterpart must assume. Additional logic can instead be added to this type
of block to detect when its execution is complete. Synthesis tools do not yet have the ability
to generate the completion detection circuit for a particular functional block, such as the
basic add/subtract unit shown in Figure 18.


Figure 18. One-Bit Adder without Completion Detection

A dual-rail adder scheme similar to the Manchester adder can be used to implement completion
detection [57]. The dual-rail adder works on the principle that each stage will have either a
carry out (COUT) or no carry out (NOCOUT) condition based on the inputs to the stage. Adding 0
and 0 will never result in a carry out, even if there is a carry in. Likewise, adding 1 and 1
will always result in a carry out, even if the carry in is 0. In these cases the carry condition
is determined by the data to be summed alone, which gives early completion detection. Adding 0
and 1 or 1 and 0 may or may not produce a carry out, depending on the carry in condition. In
this case, the stage must wait for either a carry in (CIN) or no carry in (NOCIN) value. The end
result is that the completion detection circuit simply becomes the NOR of the COUT and NOCOUT
values. Whenever one of these conditions exists, it indicates that all input values necessary
for evaluating the sum are present, and DONE is asserted. A design with improved throughput is
shown in Figure 19.


Figure  19.  One-bit  Adder  with  Completion  Detection 
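
The carry rules described above can be sketched as a small behavioral model. This is a hedged illustration, not the circuit of Figure 19: the function name is hypothetical, the sum bit is omitted, and DONE is modeled as active-high (the OR of COUT and NOCOUT). The NOR named in the text would produce the complementary, active-low form of the same signal.

```python
def dual_rail_stage(a, b, cin, nocin):
    """One adder stage with a dual-rail carry.

    a, b: input bits (0 or 1).
    cin, nocin: dual-rail carry in; (0, 0) means "not yet arrived".
    Returns (cout, nocout, done).
    """
    if a == 1 and b == 1:
        cout, nocout = 1, 0          # carry known from the data alone
    elif a == 0 and b == 0:
        cout, nocout = 0, 1          # no carry, known from the data alone
    else:
        cout, nocout = cin, nocin    # must wait for the carry in to arrive
    done = cout | nocout             # active-high model of completion
    return cout, nocout, done
```

With inputs 1 and 0 and no carry in yet, `done` stays 0 until either CIN or NOCIN arrives, which is the waiting case described in the text.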

The  burst  mode  design  methodology  is  used  to  design  asynchronous  controllers  or  finite  state 
machines.  Synchronous  finite  state  machines  are  easily  synthesized  by  using  latches,  flip-flops 
and  clock  circuitry.  Asynchronous  controllers  or  AFSMs  must  be  synthesized  using  specialized 
design  tools,  such  as  3D  [58]. 

A user-specified state table of entry and exit conditions for the state machine is provided to
3D. An example state table is shown in Table 8 for a Johnson counter (00 → 01 → 11 → 10). 3D
converts the state table to positive logic equations. These equations are then manually
converted into behavioral HDL. A logic synthesizer (with structuring and Boolean optimization
disabled) can be used to convert the positive logic behavioral HDL into negative logic
structural HDL. After the structural HDL is generated, reset circuitry and corrections for
fanout are added manually to the controller circuit. The final two-bit Johnson counter circuit,
including the reset circuit, is shown in Figure 20.

Table 8. 3D State Table of a Two-Bit Johnson Counter

Present State   Next State   Entry Conditions   Exit Conditions
0               1            COUNT+
1               2            COUNT-             BIT0+
2               3            COUNT+
3               4            COUNT-             BIT1+
4               5            COUNT+
5               6            COUNT-             BIT1+
6               7            COUNT+             BIT0-
7               0            COUNT-             BIT1-

Figure  20.  Gate  Level  Schematic  of  a  Synthesized  Two-Bit  Johnson  Counter 

Depending on the complexity of the AFSM, 3D may not be able to synthesize the controller. The
controller must then be broken down, using Shannon decomposition, and resynthesized. The
two-bit Johnson counter example illustrates how asynchronous synthesis tools work, but it also
highlights that automated AFSM synthesis does not always produce the most elegant solution
[59]. A simpler implementation of the two-bit Johnson counter (with nearly the same transistor
count) uses two D-registers, as shown in Figure 21. Note also that the advantage of the Johnson
counter is that only one bit changes each clock cycle, which avoids possible data hazards.
Figure 21. Improved Two-Bit Johnson Counter
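
The counting sequence of a two-register Johnson counter can be checked with a short behavioral sketch. It assumes the standard twisted-ring wiring (BIT1 takes BIT0, BIT0 takes the inverse of BIT1), which may differ in detail from the schematic in Figure 21; the function names are hypothetical.

```python
def johnson_next(bit1, bit0):
    """Next state of a two-bit Johnson (twisted-ring) counter."""
    return bit0, 1 - bit1   # BIT1 takes BIT0; BIT0 takes inverted BIT1


def johnson_sequence(steps, start=(0, 0)):
    """List of (bit1, bit0) states visited over the given number of steps."""
    state = start
    seq = [state]
    for _ in range(steps):
        state = johnson_next(*state)
        seq.append(state)
    return seq
```

Calling `johnson_sequence(4)` walks 00, 01, 11, 10 and back to 00, with exactly one bit changing at each step.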

Functional  blocks  in  an  asynchronous  design  must  have  a  standard  handshaking  protocol  in  order 
to  interface  with  other  blocks.  A  generic  functional  block  in  an  asynchronous  design  is  shown  in 
Figure  22.  The  REQIN  signal  represents  the  external  request  to  the  block  to  input  new  data.  The 
ACKIN  signal  is  asserted  when  the  new  input  data  is  fully  latched  or  accepted.  The  REQOUT 
signal  represents  the  request  of  the  functional  block  to  send  processed  data  out.  The  ACKOUT 
signal  is  the  external  acknowledgement  from  the  next  block  that  the  processed  data  was  latched  or 
accepted. 


[Block diagram: FUNCTIONAL BLOCK with REQIN and ACKIN on the input side and REQOUT and ACKOUT on the output side]

Figure  22.  Asynchronous  Functional  Block 

The speed independent methodology describes two standards for handshaking between connecting
blocks. It does not assume any pre-defined delays but relies on a set of handshaking signals
between the blocks. The two-phase model is illustrated in Figure 23. It is a scheme that senses
signal transitions to complete the handshake cycle. The first exchange is signaled by a
low-to-high transition on REQ (1), and ACK (2) responds by acknowledging the request. The second
cycle uses the complementary set of transitions to complete the cycle.


[Timing diagram: REQ and ACK waveforms across Cycle 1 and Cycle 2, with transitions labeled 1 and 2]


Figure  23.  Asynchronous  Two-phase  Handshaking  Model 
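
The two-phase scheme can be sketched as an event list in which every signal transition carries meaning. This is a minimal behavioral illustration with a hypothetical function name, not a model of the detection circuit.

```python
def two_phase_transfers(n):
    """Return the (signal, new_level) events for n data transfers.

    In two-phase signaling every transition is an event, so each
    transfer costs one REQ toggle followed by one ACK toggle.
    """
    req = ack = 0
    events = []
    for _ in range(n):
        req ^= 1
        events.append(("REQ", req))
        ack ^= 1
        events.append(("ACK", ack))
    return events
```

Two transfers produce a rising pair followed by a falling pair, matching the complementary second cycle described in the text.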
The four-phase model is illustrated in Figure 24. It has a four-cycle handshake for each data
exchange. Although the four-phase model appears to be more difficult to implement, its detection
circuit is actually smaller than that of the two-phase model. The four-phase model is the
primary interface standard used throughout this design.


[Timing diagram: handshake signal waveforms across Cycle 1 and Cycle 2, with numbered transitions]


Figure  24.  Asynchronous  Four-phase  Handshaking  Model 
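
A four-phase exchange can be sketched as a fixed sequence of four events per transfer. The ordering below is the conventional return-to-zero sequence (REQ high, ACK high, REQ low, ACK low) and is an assumption, since the numbering in Figure 24 is not recoverable here; the names are hypothetical.

```python
FOUR_PHASE_ORDER = [("REQ", 1), ("ACK", 1), ("REQ", 0), ("ACK", 0)]


def four_phase_transfers(n):
    """Events for n transfers; both signals return to zero each time."""
    return FOUR_PHASE_ORDER * n


def transfer_complete(events):
    """True when REQ and ACK are both back at their idle (low) level."""
    level = {"REQ": 0, "ACK": 0}
    for signal, value in events:
        level[signal] = value
    return level["REQ"] == 0 and level["ACK"] == 0
```

Because both signals always return to the same idle level, a receiver only needs to sense levels, which is one common explanation for the smaller detection circuit noted in the text.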


4.2.2  Case  Study  of  RHBD  and  Asynchronous  Logic  Synergy 

The basic idea behind this case study was to demonstrate the advantages of using RHBD and
asynchronous logic together. Although area is sacrificed, the hope was that these techniques would
offer  higher  performance,  a  flatter  power  spectrum,  and  similar  energy  consumption  when 
compared  to  a  synchronous  design.  Although  an  obscure  application,  this  approach  is  not 
completely  novel,  as  various  elements  have  been  presented  before  in  [60]  and  greatly  expanded 
upon  in  [61]  with  test  results  in  [62].  Unfortunately,  these  efforts  failed  to  influence  the 
community  due  to  the  lack  of  a  convincing  case  study,  which  is  the  purpose  of  this  work. 

It  should  be  noted  that  other  approaches  have  been  investigated  for  space  applications  of 
asynchronous  logic.  For  example,  fault  tolerance  and  deadlock  have  been  addressed  by  works 
such as [63]-[65]. These approaches focus on logic gate and circuit level redundancy techniques
to improve SEU hardness, but they exclude TID and SEE considerations, which are mitigated
through RHBD. They can, however, be used in addition to RHBD for mission critical applications
in very harsh radiation environments.

To  make  a  convincing  argument,  a  common  design  is  implemented  in  three  ways:  synchronous 
with  a  commercial  cell  library,  synchronous  with  a  RHBD  cell  library,  and  asynchronous  with  the 
same  RHBD  library.  The  textbook  “MIPS”  multi-cycle  microprocessor  architecture  is  used  as  the 
baseline  design  as  illustrated  in  Figure  25  (adapted  from  Fig.  5.28  [66]).  To  keep  the  size  small 
and  affordable,  a  16-bit  fixed-point  4-register  variant  (versus  32-bit  floating  point  32-register)  is 
implemented  with  a  simplified  instruction  set  shown  in  Table  9.  The  functional  block  descriptions 
are  given  in  Table  10. 
Figure 25. MIPS Conceptual Block Diagram


Table  9.  Simplified  MIPS  Instruction  Set 


Instruction        Meaning                     16-bit Instruction   Cycles
add                rd = rt + rs                0000rsrtrd000000     4
subtract           rd = rt - rs                0000rsrtrd000010     4
logical AND        rd = rt (bitwise and) rs    0000rsrtrd000100     4
logical OR         rd = rt (bitwise or) rs     0000rsrtrd000101     4
set on less than   set rd = 1 if rt < rs       0000rsrtrd001010     4
load word          rt = mem[rs + addressx]     0001rsrtaddressx     5
store word         mem[rs + addressx] = rt     0010rsrtaddressx     5
branch on equal    if rs = rt go to addressx   0011rsrtaddressx     3
jump               jump to addressx            0100000000000000     3
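
The instruction formats in Table 9 can be illustrated with a small decoder. This is a sketch under one stated assumption: each character of a field name in the 16-bit pattern occupies one bit, so rs, rt, and rd are two bits each (four registers) and addressx is eight bits. The function and dictionary names are hypothetical, and the code is not the control logic of the fabricated chips.

```python
R_TYPE_FUNCT = {0b000000: "add", 0b000010: "subtract", 0b000100: "and",
                0b000101: "or", 0b001010: "slt"}
I_TYPE_OPCODE = {0b0001: "load word", 0b0010: "store word",
                 0b0011: "branch on equal"}


def decode(word):
    """Split a 16-bit instruction word into named fields.

    Raises KeyError for opcode or function codes not listed in Table 9.
    """
    opcode = (word >> 12) & 0xF          # bits 15..12
    rs = (word >> 10) & 0x3              # bits 11..10
    rt = (word >> 8) & 0x3               # bits 9..8
    if opcode == 0b0000:                 # register-type: rd plus 6-bit function
        rd = (word >> 6) & 0x3
        return {"op": R_TYPE_FUNCT[word & 0x3F], "rs": rs, "rt": rt, "rd": rd}
    if opcode == 0b0100:
        return {"op": "jump"}
    return {"op": I_TYPE_OPCODE[opcode], "rs": rs, "rt": rt,
            "address": word & 0xFF}      # bits 7..0
```

For example, `decode(0b0000_01_10_11_000010)` identifies a subtract with rs = 1, rt = 2, and rd = 3.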

The entry of the baseline synchronous/standard cell design into Cadence is outlined in Table 10.
The  baseline  design  is  then  copied  and  renamed  as  the  synchronous/RHBD  variant.  The 
synchronous/RHBD  variant  is  simply  modified  by  using  a  global  search  and  replace  of  the  cell 
library  name,  beginning  at  step  14  of  Table  10.  Steps  15-22  were  repeated  to  complete  the  design. 
Both  synchronous  variants  were  submitted  for  fabrication  on  AMS  S35  run  1725.  The  final  layout 
and  micrograph  of  the  fabricated  chips  are  shown  in  Figure  26  and  Figure  27. 
Table 10. Cadence Design Flow

Step  Tool                    Build Action(s)
1     Library Manager         New design library
2     Virtuoso (Schematic)    16-bit multiplexors (MUX): 2:1, 3:1, 4:1
3     Virtuoso (Schematic)    Arithmetic Logic Unit (ALU) basic block: 1-bit add/sub
4     Virtuoso (Schematic)    16-bit ALU blocks: add/sub, and, or, slt, zero detect
5     Virtuoso (Schematic)    Top-level ALU
6     Virtuoso (Schematic)    ALU control (ALU C)
7     Virtuoso (Schematic)    16-bit registers: Program Counter (PC), Memory Data Register (MDR), Instruction Register (IR), A, B, ALUOut (AO)
8     Virtuoso (Schematic)    Hardwired blocks: Shift Left 2 (SL2), Sign Extend (SE), Four (4), Zero (0)
9     Virtuoso (Schematic)    Top-level register file (3 registers + hardwired 0)
10    RTL Compiler            Synthesis of Control block from Verilog description
11    DFII (Import Verilog)   Import synthesized logic into schematic
12    Virtuoso (Schematic)    Top-level MIPS
13    NC-Verilog              Verilog testbench of all instructions with accurate timing
14    Virtuoso (Schematic)    Top-level chip (adding I/O pads)
15    NC-Verilog              Reverify testbench, export netlist
16    RTL Compiler            Pass-through of netlist to satisfy SOC Encounter format
17    SOC Encounter           Import netlist, place I/O and core, route, clock tree synthesis (CTS), export netlist, export gdsii stream
18    NC-Verilog              Import layout netlist to schematic, reverify testbench
19    DFII (Import Stream)    Import gdsii stream to layout
20    Virtuoso (Layout)       Inspect layout
21    Assura                  Run DRC, LVS, RCX
22    UltraSim                Run full-chip simulation, compare results with Verilog
23    DFII (Export Stream)    Export gdsii file for fabrication, submit design
Figure 26. Synchronous/Commercial Layout and Micrograph (400x400 µm Core)

Notice  the  four  test  structures  in  the  micrograph  in  Figure  26.  Three  of  the  structures  are  basic 
RHBD  structures  intended  for  use  at  a  micro  probe  station:  nMOS,  pMOS,  and  an  inverter.  The 
fourth  test  structure  is  in  the  upper  right  hand  corner,  which  is  a  small  bank  of  photocells  with  the 
same  initial  design  structure. 


Figure 27. Synchronous/RHBD Layout and Micrograph (700x700 µm Core)

Recall that the RHBD library is a layout-only modification of the AMS HIT KIT 3.70. The original
plan was to use Signal Storm to generate HDL and timing libraries. However, this idea was
abandoned on realizing that it would result in reduced drive strength during the various
optimization stages. To maintain radiation hardness, particularly to SEU and SET, the drive
strength and fanout ratios must be kept in the same proportion as in the standard cell library.
Therefore, the best approach is to use the standard cell timing libraries.
There are some minor differences between the two designs just presented, regarding the RHBD
cell  library.  Due  to  time  constraints,  the  RHBD  library  does  not  have  the  full  array  of  buffer  and 
inverter  cells  that  are  used  during  clock  tree  synthesis  (CTS).  However,  the  CTS  process 
compensated  for  this  appropriately,  as  the  sum  of  the  transistor  widths  is  the  same.  In  addition,  the 
I/O  cells  are  the  unmodified  commercial  version,  also  due  to  time  constraints.  This  does  not  affect 
the  simulation  or  hardware  results  significantly,  as  the  nMOS  transistor  widths  are  equivalent. 

The  final  design  in  the  case  study  is  an  asynchronous/RHBD  variant.  Asynchronous  logic  offers 
potential power savings and performance improvements, with a tradeoff in design complexity and
a usually small area penalty. In its purest form, this circuit design approach aims to minimize
transistor  switching.  Due  to  the  variety  of  circuit  types  and  implementation  techniques,  the  design 
process  can  be  quite  complex. 

The  unpipelined  MIPS  architecture  may  not  be  the  best  for  demonstrating  dramatic  power 
reductions, but it does offer the observer direct insight into the design process. For example, it does
not  make  sense  to  break  down  the  architecture  into  smaller  blocks  where  handshaking  can  be 
applied.  Instead,  the  MIPS  circuit  should  be  thought  of  as  a  design  block  in  a  larger  asynchronous 
SoC,  as  in  the  envisioned  sensor  node  architecture.  The  external  interface  of  the  asynchronous 
MIPS  implementation  is  shown  in  Figure  22  with  four-phase  handshaking  as  in  Figure  24. 

Several  asynchronous  design  methodologies  are  applied  to  the  synchronous  MIPS  architecture. 
This  approach  is  not  to  be  confused  with  de-synchronization  as  defined  in  [55],  but  rather  a 
unique  focus  on  overall  power  reduction  and  flattening  the  power  spectrum.  The  global  clock  is 
removed,  but  instead  of  replacing  the  flip-flops  with  master-slave  latches  and  delay  elements  as  in 
de-synchronization,  a  phased  sequence  of  latching  with  tailored  delay  elements  is  carefully 
applied  across  the  latches  and  multiplexers  in  the  data  path,  as  shown  in  Figure  28.  Care  is  taken 
to  ensure  a  hazard-free  sequence  and  no  double-switching  of  elements.  The  synchronous  FSM 
control  block  is  improved  to  minimize  latching  of  the  MDR  and  ALUOut  registers.  Additionally, 
a  form  of  clock  gating  is  applied  within  all  registers,  which  allows  the  use  of  basic  D-latches 
without  enables.  This  also  requires  latches  to  be  placed  on  all  control  signals  and  phased  in  as 
appropriate.  The  applied  approaches  are  summarized  in  Table  11. 


Table  11.  Asynchronous  Design  Approaches  Implemented 


Step  Action                            Benefit
1     Remove global clock               Overhead of CTS eliminated, power reduced
2     Add phased latching sequence      Flattens power spectrum
3     Add delays within registers       Further flattens power spectrum
4     Improve MIPS control              Eliminates redundant latching, power reduced
5     Add clock gating                  Power reduced
6     Remove unused inverting outputs   Power and area reduced
[Diagram: REQ_IN passes through a chain of delay elements to REQ_OUT]


Figure  28.  Phase-Latched  Asynchronous  Approach 

The custom re-design of most elements in the MIPS architecture just discussed affects all steps
in Table 10. Most notably, CTS and optimization are prevented in step 17. The asynchronous/RHBD
variant was fabricated on AMS S35 run 1791. The final layout and fabricated die micrograph are
shown in Figure 29.


Figure 29. Asynchronous/RHBD Layout and Micrograph (720x720 µm Core)
4.3 Comparison, Simulation, and Test Results

A  common  test  bench  is  used  for  NC-Verilog  simulation,  UltraSim  simulation,  and  hardware 
testing  using  National  Instruments  Digital  Waveform  Editor  and  LabView.  NC-Verilog  is  a 
functional  simulator  that  uses  back-annotated  timing  information  for  each  element.  Simulation 
results are available immediately. UltraSim is based on Spice and uses extracted parameters for
a more accurate simulation, but it uses a proprietary algorithm to allow full-chip simulations
in a reasonable amount of time. For example, most of the full-chip simulations take around one
hour to run, versus days for this size of design on Spice or HSpice. The UltraSim results are advertised
to  be  within  5%  of  Spice.  The  test  bench  is  shown  in  Table  12,  indicating  expected  output  data 
(DATA  OUT)  and  expected  address  (ADDR)  based  on  the  instruction  and  data  mix  given  to  the 
microcontroller  (DATA  IN). 

Table 12. Common Test Bench Including Expected Results

DATA IN                       Expected DATA OUT   Expected ADDR
load R1 from address 0x0001                       0x0000
0xFFFF                                            0x0001
load R2 from address 0x0002                       0x0004
0x0001                                            0x0002
R3 = R1 + R2                                      0x0008
store R3 to address 0x0000    0x0000              0x000C
R3 = R1 - R2                                      0x0010
store R3 to address 0x0000    0xFFFE              0x0014
R3 = R1 (bitwise and) R2                          0x0018
store R3 to address 0x0000    0x0001              0x001C
R3 = R1 (bitwise or) R2                           0x0020
store R3 to address 0x0000    0xFFFF              0x0024
R3 = R1 < R2                                      0x0028
store R3 to address 0x0000    0x0001              0x002C
branch if R1 = R2                                 0x0030
load R2 from address 0x0002                       0x0034
0xFFFF                                            0x0002
R3 = R1 < R2                                      0x0038
store R3 to address 0x0000    0x0000              0x003C
branch if R1 = R2                                 0xFEEC
jump to 0                                         0xC000

Figure 30 illustrates an example of the complete NC-Verilog simulation testbench. Figure 31 is
the result of the UltraSim simulation, which matches the expected results and the NC-Verilog
simulation. Figure 32 shows the LabView hardware results. Only the inputs or the outputs can be
shown in LabView at one time, so the output values are shown, indicating that the fabricated
test chip performs functionally as expected. The simulation and hardware outputs for the
synchronous/commercial gate, synchronous/RHBD, and asynchronous/RHBD designs are all very
similar and are not repeated. The maximum frequency of all designs is 16.67 MHz in simulation,
but unfortunately, the hardware test platform only operates up to 12.5 MHz.
Figure 30. Example NC-Verilog Simulation Testbench Output


Figure  31.  Example  UltraSim  Testbench  Output 
Figure 32. Example LabView Hardware Testbench Output

Although correct functionality is important to verify, the most important aspect of this
comparison is the power performance. NC-Verilog is not able to report on power consumption.
Figure 33 illustrates an example power output for the entire testbench and a closeup of an
example single clock cycle in UltraSim. Figure 34 illustrates the power spectrum of the three
designs, left to right. Note that the asynchronous design on the right has the smoothest profile
and the lowest peaks.


[Plot: Commercial Synchronous Power Spectrum]


Figure  33.  Example  Simulation  Testbench  Power  Spectrum  in  UltraSim 




Figure  34.  Example  Hardware  Testbench  Power  Spectrum 

A comparison of results is given in Table 13. In this case study with this particular design,
the application of RHBD resulted in a 200% core area increase from the baseline design and
required 160% more energy for the same testbench at any frequency, as determined through
UltraSim simulations. Figure 34 and Figure 35 clearly illustrate that all the asynchronous
approaches taken to reduce the power and smooth the power spectrum are indeed effective. Figure
36 verifies that the final hardware results correlate well with the predicted simulation
results across the 1.25 to 12.5 MHz test points. The most significant result is that the
asynchronous approach reduced the energy penalty to 85% (from 160%) for a 6% area increase with
no performance impact.


Table  13.  Comparison  of  Three  Design  Approaches 


Test Chip                     Total Transistor Width (µm)   Core Area (µm)   Energy (nJ) (UltraSim)
synchronous/commercial (sc)   16,088                        400x400          28
synchronous/RHBD (sr)         60,450                        700x700          71
asynchronous/RHBD (ar)        55,973                        720x720          51

49 
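
The penalties discussed above can be approximated directly from the Table 13 entries. The sketch below is only an arithmetic check: it assumes the core dimensions are in micrometers, takes 51 nJ as the asynchronous/RHBD energy, and uses hypothetical variable names. The values it yields are close to, but not identical with, the rounded percentages quoted in the text, which may rest on unrounded data.

```python
def percent_increase(new, base):
    """Percentage by which new exceeds base."""
    return 100.0 * (new - base) / base


# Core area in square micrometers and energy in nJ, read from Table 13.
area = {"sc": 400 * 400, "sr": 700 * 700, "ar": 720 * 720}
energy = {"sc": 28, "sr": 71, "ar": 51}   # 51 nJ for "ar" is an assumption

rhbd_area = percent_increase(area["sr"], area["sc"])        # RHBD vs. baseline
rhbd_energy = percent_increase(energy["sr"], energy["sc"])  # RHBD vs. baseline
async_area = percent_increase(area["ar"], area["sr"])       # async vs. sync RHBD
async_energy = percent_increase(energy["ar"], energy["sc"]) # async vs. baseline
```

Under these assumptions the RHBD core is roughly three times the baseline area, and the asynchronous variant adds only a few percent of area on top of that while cutting the energy penalty substantially.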


Figure  35.  Single  Clock  Cycle  Comparison  in  UltraSim 


[Plot legend: sc (sim), sc (hw), sr (sim), sr (hw), ar (sim), ar (hw)]


Figure  36.  Comparison  of  Simulation  and  Hardware  Power  Consumption 
5 Conclusions

5.1  Summary  of  Results 

A  cost-effective  monolithic  system-on-a-chip  approach  is  under  investigation  to  fabricate  large 
numbers  of  wireless  sensor  nodes  for  hostile  environments  including  space.  Two  essential 
building  blocks  have  been  developed  and  reported  on:  integrated  solar  cells  in  CMOS  and 
radiation  hardening  by  design  of  asynchronous  logic. 

A  first-ever  design  for  integrated  solar  cells  in  commercial  SiGe  BiCMOS  is  presented.  Two 
prototype  designs  have  been  designed,  fabricated,  and  tested.  The  average  efficiency  of  the  first 
prototype  is  2.4%,  compared  to  an  estimated,  but  unverified  1%  from  previous  work.  The  actual 
efficiency  of  the  junction  is  8.3%,  without  considering  the  metallization  overhead.  An  improved 
design  demonstrates  3.44%  efficiency,  a  40%  improvement.  The  junction  efficiency  alone  is 
11.3%.  However,  power  from  these  first  two  prototypes  cannot  be  harnessed  properly  in  the 
current  implementation.  A  final  design,  overcoming  this  limitation,  has  been  submitted  for 
fabrication  and  will  be  reported  in  a  later  publication.  This  novel  development  has  potential 
widespread  application  to  a  rapidly  growing  number  of  SoC  designs. 

The  application  of  radiation  hardening  by  design  to  asynchronous  logic  is  suggested  as  a  unique 
approach  for  bare  die  SoC  implementations  in  hostile  environments.  A  case  study  is  presented 
using  a  common  design.  Starting  with  a  common  synchronous  microcontroller  design 
implemented  with  commercial  logic  gates,  the  application  of  RHBD  results  in  an  expected  200% 
core  area  increase  and  requires  160%  more  energy.  The  most  significant  result  is  that  the 
application  of  asynchronous  design  reduced  the  energy  penalty  to  85%  (from  160%)  for  a  6%  area 
increase  with  no  performance  impact.  Additionally,  electromagnetic  interference  is  greatly 
reduced.  This  approach  provides  environmental  tolerance  to  radiation  and  temperature  extremes. 


5.2  Future  Research  Directions 

A  suggested  next  step  would  be  the  monolithic  integration  of  the  developed  solar  cells  and 
microcontroller  with  a  single-chip  radio  design  and  simple  sensor.  The  focus  of  this  work  would 
be  to  minimize  or  eliminate  the  traditional  external  components  and  establish  self-powered 
wireless  interconnectivity.  To  date  this  complete  monolithic  approach  has  not  been  demonstrated 
in  the  literature  and  would  make  a  great  impact  on  a  number  of  technology  applications. 

Due to a lack of time and resources, this project could not address two other important aspects
of the SoC design, detailed below, which should be undertaken by a follow-up project:

System Configuration: A simple configuration, such as two die sandwiched together, could help
meet power and thermal requirements. An investigation is required to determine the material
composition and minimal packaging. A preliminary design has been reported in [46].

Stand-alone Transceiver: All SoC transceivers to date require external passive devices, precision
frequency  oscillators,  and  antennas.  Research  is  needed  to  determine  if  a  very  simple  transceiver, 
perhaps  using  on-off  keying  (OOK)  modulation,  could  be  implemented  on  CMOS  without  any 
external  components.  However,  it  has  been  clearly  demonstrated  that  an  external  antenna  will  be 
required  to  achieve  any  meaningful  range. 